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Abstract 


This document provides a summary of the "Workshop on Internet of 
Things (IoT) Semantic Interoperability (IOTSI)", which took place in 
Santa Clara, California March 17-18, 2016. The main goal of the 
workshop was to foster a discussion on the different approaches used 
by companies and Standards Developing Organizations (SDOs) to 
accomplish interoperability at the application layer. This report 
summarizes the discussions and lists recommendations to the standards 
community. The views and positions in this report are those of the 
workshop participants and do not necessarily reflect those of the 
authors or the Internet Architecture Board (IAB), which organized the 
workshop. Note that this document is a report on the proceedings of 
the workshop. The views and positions documented in this report are 
those of the workshop participants and do not necessarily reflect IAB 
views and positions. 


Status of This Memo 


This document is not an Internet Standards Track specification; it is 
published for informational purposes. 


This document is a product of the Internet Architecture Board (IAB) 
and represents information that the IAB has deemed valuable to 
provide for permanent record. It represents the consensus of the 
Internet Architecture Board (IAB). Documents approved for 
publication by the IAB are not candidates for any level of Internet 
Standard; see Section 2 of RFC 7841. 


Information about the current status of this document, any errata, 
and how to provide feedback on it may be obtained at 
https://www.rfc-editor.org/info/rfc8477. 
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Les 


Introduction 


The Internet Architecture Board (IAB) holds occasional workshops 
designed to consider long-term issues and strategies for the 
Internet, and to suggest future directions for the Internet 
architecture. The investigated topics often require coordinated 
efforts from many organizations and industry bodies to improve an 
identified problem. One of the targets of the workshops is to 
establish communication between relevant organizations, especially 
when the topics are out of the scope of the Internet Engineering Task 
Force (IETF). This long-term planning function of the IAB is 
complementary to the ongoing engineering efforts performed by working 
groups of the IETF. 


With the expansion of the Internet of Things (IoT), interoperability 
becomes more and more important. Standards Developing Organizations 
(SDOs) have done a tremendous amount of work to standardize new 
protocols and profile existing protocols. 


At the application layer and at the level of solution frameworks, 
interoperability is not yet mature. Particularly, the work on data 
formats (in the form of data models and information models) has not 
seen the same level of consistency throughout SDOs. 


One common problem is the lack of an encoding-independent 
standardization of the information, the so-called information model. 
Another problem is the strong relationship between data formats and 
the underlying communication architecture, such as a design in Remote 
Procedure Call (RPC) style or a RESTful design (where REST refers to 
Representational State Transfer). Furthermore, groups develop 
solutions that are very similar on the surface but differ slightly in 
their standardized outcome, leading to interoperability problems. 
Finally, some groups favor different encodings for use with various 
application-layer protocols. 


Thus, the IAB decided to organize a workshop to reach out to relevant 
stakeholders to explore the state of the art and identify commonality 
and gaps [IOTSIAG] [IOTSIWS]. In particular, the IAB was interested 
to learn about the following aspects: 


o What is the state of the art in data and information models? What 
should an information model look like? 


o What is the role of formal languages, such as schema languages, in 
describing information and data models? 


o What is the role of metadata, which is attached to data to make it 
self-describing? 
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o How can we achieve interoperability when different organizations, 
companies, and individuals develop extensions? 


o What is the experience with interworking various data models 
developed from different groups, or with data models that evolved 
over time? 


o What functionality should online repositories for sharing schemas 
have? 


o How can existing data models be mapped against each other to offer 
interworking? 


o Is there room for harmonization, or are the use cases of different 
groups and organizations so unique that there is no possibility 
for cooperation? 


o How can organizations better work together to increase awareness 
and information sharing? 


2. Terminology 


The first roadblock to interoperability at the level of data models 
is the lack of a common vocabulary to start the discussion. 

[RFC3444] provides a starting point by separating conceptual models 
for designers, or "information models", from concrete detailed 
definitions for implementers, or "data models". There are concepts 
that are undefined in that RFC and elsewhere, such as the interaction 
with the resources of an endpoint, or "interaction model". 

Therefore, the three "main" common models that were identified were: 


Information Model 
An information model defines an environment at the highest level 
of abstraction and expresses the desired functionality. 
Information models can be defined informally (e.g., in prose) or 
more formally (e.g., Unified Modeling Language (UML), Entity- 
Relationship Diagrams, etc.). Implementation details are hidden. 


Data Model 
A data model defines concrete data representations at a lower 
level of abstraction, including implementation- and protocol- 
specific details. Some examples are SNMP Management Information 
Base (MIB) modules, World Wide Web Consortium (W3C) Thing 
Description (TD) Things, YANG modules, Lightweight Machine-to- 
Machine (LwM2M) Schemas, Open Connectivity Foundation (OCF) 
Schemas, and so on. 
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Interaction Model 
An interaction model defines how data is accessed and retrieved 
from the endpoints, being, therefore, tied to the specific 
communication pattern that the system has (e.g., REST methods, 
Publish/Subscribe operations, or RPC calls). 


Another identified terminology issue is the semantic meaning overload 
that some terms have. The meaning can vary depending on the context 
in which the term is used. Some examples of such terms are as 
follows: semantics, models, encoding, serialization format, media 
types, and encoding types. Due to time constraints, no concrete 
terminology was agreed upon, but work will continue within each 
organization to create various terminology documents. The 
participants agreed to set up a GitHub repository [IOTSIGIT] for 
sharing information. 


3. What Problems to Solve 


The participants agreed that there is not simply a single problem to 
be solved but rather a range of problems. During the workshop, the 
following problems were discussed: 


o Formal Languages for Documentation Purposes 


To simplify review and publication, SDOs need formal descriptions of 
their data and interaction models. Several of them use a tabular 
representation found in the specification itself but use a formal 
language as an alternative way of describing objects and resources 
for formal purposes. Some examples of formal language use are as 
follows. 


The Open Mobile Alliance (OMA), now OMA SpecWorks, used an XML Schema 
[LWM2M-Schema] to describe their object and resource definitions. 

The XML files of standardized objects are available for download at 
[OMNA]. 


The Bluetooth Special Interest Group (SIG) defined Generic Attribute 
Profile (GATT) services and characteristics for use with Bluetooth 
Smart/Low Energy. The services and characteristics are shown in a 
tabular form on the Bluetooth SIG website [SIG] and are defined as 
XML instance documents. 


The Open Connectivity Foundation (OCF) uses JSON Schemas to formally 
define data models and RESTful API Modeling Language (RAML) to define 
interaction models. The standard files are available online at 
<oneloTa.org>. 
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The AllSeen Alliance uses AllJoyn Introspection XML to define data 
and interaction models in the same formal language, tailored for 
RPC-style interaction. The standard files are available online on 
the AllSeen Alliance website, but both standard and vendor-defined 
model files can be obtained by directly querying a device for them at 
runtime. 


The World Wide Web Consortium (W3C) uses the Resource Description 
Framework (RDF) to define data and interaction models using a format 
tailored for the web. 


The Internet Engineering Task Force (IETF) uses YANG to define data 
and interaction models. Other SDOs may use various other formats. 


o Formal Languages for Code Generation 


Code-generation tools that use formal data and information modeling 
languages are needed by developers. For example, the AllSeen Visual 
Studio Plugin [AllSeen-Plugin] offers a wizard to generate code based 
on the formal description of the data model. Another example of a 
data modeling language that can be used for code generation is YANG. 
A popular tool to help with code generation of YANG modules is pyang 
[PYANG]. An example of a tool that can generate code for multiple 
ecosystems is OpenDOF [OpenDOF]. Use cases discussed for code 
generation included easing development of server-side device 
functionality, clients, and compliance tests. 


o Debugging Support 


Debugging tools are needed that implement generic object browsers, 
which use standard data models and/or retrieve formal language 
descriptions from the devices themselves. As one example, the nRF 
Bluetooth Smart sniffer from Nordic Semiconductor [nRF-Sniffer] can 
be used to display services and characteristics defined by the 
Bluetooth SIG. As another example, AllJoyn Explorer 
[AllJoynExplorer] can be used to browse and interact with any 
resource exposed by an AllJoyn device, including both standard and 
vendor-defined data models, by retrieving the formal descriptions 
from the device at runtime. 


o Translation 
The working assumption is that devices need to have a common data 
model with a priori knowledge of data types and actions. However, 


that would imply that each consortium/organization will try to define 
their own data model. That would cause a major interoperability 


Jimenez, et al. Informational [Page 6] 


RFC 8477 IOTSI Workshop 2016 October 2018 


problem, possibly a completely intractable one given the number of 
variations, extensions, compositions, or versioning changes that will 
happen for each data model. 


Another potential approach is to have a minimal amount of information 
on the device to allow for a runtime binding to a specific model, the 
objective being to require as little prior knowledge as possible. 


Moreover, gateways, bridges and other similar devices need to 
dynamically translate (or map) one data model to another one. 
Complexity will increase as there are also multiple protocols and 
schemas that make interoperability harder to achieve. 


o Runtime Discovery 


Runtime discovery allows IoT devices to exchange metadata about the 
data, potentially along with the data exchanged itself. In some 
cases, the metadata not only describes data but also the interaction 
model as well. An example of such an approach has been shown with 
Hypermedia as the Engine of Application State (HATEOAS) [HATEOAS]. 
Another example is that all AllJoyn devices support such runtime 
discovery using a protocol mechanism called "introspection", where 
the metadata is queried from the device itself [AllSeen]. 


There are various models, whether deployed or possible, for such 
discovery. The metadata might be extracted from a specification, 
looked up on a cloud repository (e.g., oneloTa for OCF models), 
looked up via a vendor’s site, or obtained from the device itself 
(such as in the AllJoyn case). The relevant metadata might be 
obtained from the same place or different pieces might be obtained 
from different places, such as separately obtaining (a) syntax 
information, (b) end-user descriptions in a desired language, and (c) 
developer-specific comments for implementers. 


4. Translation 


In an ideal world where organizations and companies cooperate and 
agree on a single data model standard, there is no need for gateways 
that translate from one data model to another one. However, this is 
far from reality today, and there are many proprietary data models in 
addition to the already standardized ones. As a consequence, 
gateways are needed to translate between data models. This leads to 
(n*2)-n combinations, in the worst case. 


There are analogies with gateways back in the 1980s that were used to 
translate between network layer protocols. Eventually, IP took over, 
providing the necessary end-to-end interoperability at the network 

layer. Unfortunately, the introduction of gateways leads to the loss 
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of expressiveness due to the translation between data models. The 
functionality of IP was so valuable in the market that advanced 
features of other networking protocols became less attractive and 


were not used anymore. 


Participants discussed an alternative that they called a "red star", 
shown in Figure 1, where data models are translated to a common data 


model shown in the middle. 
that are needed down to 2n 


This reduces the number of translations 


(in the best case). The problem, of 


course, is that everyone wants their own data model to be the red 


star in the center. 
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"Red Star" in Data/Information Models 


While the workshop itself was not a suitable forum to discuss the 


design of such translation in detail, 


o Do we need a "red star" that does everything, 


several questions were raised: 


or could we design 


something that offers a more restricted functionality? 


o How do we handle loss of data and functionality? 
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o Should data be translated between data models, or should data 
models themselves be translated? 


o How can interaction models be translated? They need to be dealt 
with in addition to the data models. 


o Many (if not all) data and interaction models have some bizarre 
functionality that cannot be translated easily. How can those be 
handled? 


o What limitations are we going to accept in these translations? 


The participants also addressed the question of when translation 
should be done. Two use cases were discussed: 


(a) Design time: A translation between data model descriptions, such 
as translating a YANG module to a RAML/JSON model, can be 
performed once, during design time. A single information model 
might be mapped to a number of different data models. 


(b) Run time: Runtime translation of values in two standard data 
models can only be algorithmically done when the data model on 
one side is algorithmically derived from the data model on the 
other side. This was called a "derived model". It was 
discussed that the availability of runtime discovery Can aid in 
semantic translation, such as when a vendor-specific data model 
on one side of a protocol bridge is resolved and the translator 
can algorithmically derive the semantically equivalent vendor- 
specific data model on the other side. This situation is 
discussed in [BridgeTaxonomy]. 


The participants agreed that algorithm translation will generally 
require custom code whenever one is translating to anything other 
than a derived model. 


Participants concluded that it is typically easier to translate data 
between systems that follow the same communication architecture. 


5. Dealing with Change 


A large part of the workshop was dedicated to the evolution of 
devices and server-side applications. Interactions between devices 
and services and how their relationship evolves over time is 
complicated by their respective interaction models. 


The workshop participants discussed various approaches to deal with 
change. In the most basic case, a developer might use a description 
of an API and implement the protocol steps. Sometimes, the data or 
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information model can be used to generate code stubs. Subsequent 
changes to an API require changes on the clients to upgrade to the 
new version, which requires some development of new code to satisfy 
the needs of the new API. 


These interactions could be made machine understandable in the first 
place, enabling for changes to happen at runtime. In that scenario, 
a machine client could discover the possible interactions with a 
service, adapting to changes as they occur without specific code 
being developed to adapt to them. 


The challenge seems to be to code the human-readable specification 
into a machine-readable format. Machine-readable languages require a 
shared vocabulary to give meaning to the tags. 


These types of interactions are often based on the REST architectural 
style. Its principle is that a device or endpoint only needs a 
single entry point, with a host providing descriptions of the API 
in-band by means of web links and forms. 


By defining IoT-specific relation types, it is possible to drive 
interactions through links instead of hard-coding URIs into a RESTful 
client, thus making the system flexible enough for later changes. 

The definition of the basic hypermedia formats for IoT is still a 
work in progress. However, some of the existing mechanisms can be 
reused, such as resource discovery, forms, or links. 


6. IANA Considerations 
This document has no IANA actions. 
7. Security Considerations 


There were two types of security considerations discussed: use of 
formal data models for security configuration and security of data 
and data models in general. 


It was observed that the security assumptions and configuration, or 
"security model", varies by ecosystem today, making the job of a 
translator difficult. For example, there are different types of 
security principals (e.g., user vs. device vs. application), the use 
of Access Control Lists (ACLs) versus capabilities, and what types of 
policies can be expressed, all vary by ecosystem. As a result, the 
security model architecture generally dictates where translation can 
be done. 
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One approach discussed was whether two endpoints might be able to use 
some overlay security model across a translator between two 
ecosystems, which only works if the two endpoints agree on a common 
data model for their communication. Another approach discussed was 
simply having a translator act as a trusted intermediary, which 
enables the translator to translate between different data models. 


One suggestion discussed was either adding metadata into the formal 
data model language or having it accompany the data values over the 
wire, tagging the data with privacy levels. However, sometimes even 
the privacy level of information might itself be sensitive. Still, 
it was observed that being able to dynamically learn security 
requirements might help provide better Uls and translators. 


8. Collaboration 


The participants discussed how best to share information among their 
various organizations. One discussion was around having joint 
meetings. One current challenge reported was that organizations were 
not aware of when and where each other's meetings were scheduled, and 
sharing such information could help organizations better collocate 
meetings. To facilitate this exchange, the participants agreed to 
add links to their respective meeting schedules from a common page in 
the IOTSI repository [IOTSIGIT]. 


Another challenge reported was that organizations did not know how to 
find each other’s published data models, and sharing such information 
could better facilitate reuse of the same information model. To 
facilitate this exchange, the participants discussed whether a common 
repository might be used by multiple organizations. The OCF’s 
oneloTa repository was discussed as one possibility, but it was 
reported that its terms of use at the time of the workshop prevented 
this. The OCF agreed to take this back and look at updating the 
terms of use to allow other organizations to use it, as the 
restriction was not the intent. <schema.org> was discussed as 
another possibility. In the meantime, the participants agreed to add 
links to their respective repositories from a common page in the 
IOTSI repository [IOTSIGIT]. 


It was also agreed that the iotsi@iab.org mailing list would remain 
open and available for sharing information between all relevant 
organizations. 
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